Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(utils): linear scan is_uri #2648

Merged
merged 1 commit into from
Aug 28, 2023
Merged

perf(utils): linear scan is_uri #2648

merged 1 commit into from
Aug 28, 2023

Conversation

jamestrew
Copy link
Contributor

Benchmarked against 80k+ unique paths, both Windows and Linux (macos should be close enough to llinux).

Approximately 13% faster for linux paths and 55% faster for windows paths.

hyperfine 'luajit current_linux.lua' 'luajit fast_linux.lua' --warmup 10
Benchmark 1: luajit current_linux.lua
  Time (mean ± σ):      25.7 ms ±   0.8 ms    [User: 22.7 ms, System: 3.4 ms]
  Range (min … max):    24.0 ms …  28.8 ms    87 runs

Benchmark 2: luajit fast_linux.lua
  Time (mean ± σ):      22.7 ms ±   0.8 ms    [User: 19.4 ms, System: 3.6 ms]
  Range (min … max):    21.4 ms …  26.0 ms    99 runs

Summary
  luajit fast_linux.lua ran
    1.13 ± 0.05 times faster than luajit current_linux.lua

hyperfine 'luajit current_win.lua' 'luajit fast_win.lua' --warmup 10
Benchmark 1: luajit current_win.lua
  Time (mean ± σ):      36.3 ms ±   0.9 ms    [User: 34.1 ms, System: 2.7 ms]
  Range (min … max):    34.7 ms …  38.9 ms    69 runs

Benchmark 2: luajit fast_win.lua
  Time (mean ± σ):      23.5 ms ±   0.7 ms    [User: 21.1 ms, System: 2.9 ms]
  Range (min … max):    22.4 ms …  25.7 ms    95 runs

Summary
  luajit fast_win.lua ran
    1.55 ± 0.06 times faster than luajit current_win.lua

@juntuu
Copy link

juntuu commented Aug 23, 2023

Hi, just bumped into this while browsing.

I'm not at all familiar with the project, so can't say if the performance improvement would outweigh the complexity cost, and what would be the effect in real workloads.

However, if I'm not mistaken this change introduces two differences in the matching:

  1. does not require a : in uri (is_uri("hello") == true)
  2. the windows "drive letter" can be more than one letter (is_uri("hello:\\") == false)

Because the function is checking for uri and not windows path, I think 2. is more correct now. Uri should not contain \ anyway.

But 1. seems like a bug.

I think the following (pseudocode) should fix 1. while keeping 2.:

if filename[1] not in [a-zA-Z] then
  return false
end
for i = 2, #filename do
  if filename[i] == ':' then
    return filename[i+1] ~= '\\'
  elseif filename[i] not in [a-zA-Z0-9.+-] then
    return false
  end
end
return false

@max397574
Copy link
Contributor

max397574 commented Aug 23, 2023

@juntuu that pseudocode won't work afaict since there is no chance the elseif statement ever is true
it would be true if the character is a colon but if that's the case it will already return false because of the if statement

@juntuu
Copy link

juntuu commented Aug 23, 2023

that pseudocode won't work

Oops, good catch. Thanks!

I flipped the conditions now, so the : is checked first.

@jamestrew
Copy link
Contributor Author

Good catches to both!
I guess under ideal conditions, something like is_uri('hello') or really anything that's not a full path shouldn't happen but for the sake of correctness, I like the suggested change.

No significant performance difference between my original implementation and the suggested change.

$ hyperfine 'luajit before_linux.lua' 'luajit after_linux.lua' --warmup 10 --runs 100
Benchmark 1: luajit before_linux.lua
  Time (mean ± σ):      27.2 ms ±   0.5 ms    [User: 22.6 ms, System: 4.5 ms]
  Range (min … max):    26.4 ms …  29.9 ms    100 runs

Benchmark 2: luajit after_linux.lua
  Time (mean ± σ):      27.5 ms ±   0.8 ms    [User: 22.8 ms, System: 4.7 ms]
  Range (min … max):    26.0 ms …  30.5 ms    100 runs

Summary
  luajit before_linux.lua ran
    1.01 ± 0.03 times faster than luajit after_linux.lua
$ hyperfine 'luajit before_win.lua' 'luajit after_win.lua' --warmup 10 --runs 100
Benchmark 1: luajit before_win.lua
  Time (mean ± σ):      26.9 ms ±   0.5 ms    [User: 23.6 ms, System: 3.6 ms]
  Range (min … max):    26.0 ms …  28.8 ms    100 runs

Benchmark 2: luajit after_win.lua
  Time (mean ± σ):      25.9 ms ±   0.7 ms    [User: 22.9 ms, System: 3.3 ms]
  Range (min … max):    24.8 ms …  28.9 ms    100 runs

Summary
  luajit after_win.lua ran
    1.04 ± 0.03 times faster than luajit before_win.lua

@jamestrew
Copy link
Contributor Author

I'm going to go ahead and merge this in. I think considering this is a pretty hot function, the performance gain, particularly on Windows is worth the diminished readability.

@jamestrew jamestrew merged commit 1dfa66b into master Aug 28, 2023
13 checks passed
@jamestrew jamestrew deleted the is_uri_cleanup branch August 28, 2023 04:26
Conni2461 pushed a commit that referenced this pull request Sep 5, 2023
(cherry picked from commit 1dfa66b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants